ABSTRACT

Context. Our understanding of stellar systems depends on the adopted interpretation of the initial mass function, IMF, $\phi(m)$. Unfortunately, there is not a common interpretation of the IMF, which leads to different methodologies and diverging analyses of observational data.

Aims. We study the correlation between the most massive star that a cluster would host, $m_{\max}$, and its total mass into stars, $M$, as an example where different views of the IMF lead to different results.

Methods. We assume that the IMF is a probability distribution function and analyze the $m_{\max}$ - $M$ correlation within this context. We also examine the meaning of the equation used to derive a theoretical $M$ - $\hat{m}_{\max}$ relationship, $N \times \int_{\hat{m}_{\max}}^{m_{\mathrm{up}}} \phi(m)\,dm = 1$, with $N$ the total number of stars in the system, according to different interpretations of the IMF.

Results. We find that only a probabilistic interpretation of the IMF, where stellar masses are independent and identically distributed random variables, provides a self-consistent result. Neither $M$ nor the total number of stars in the cluster, $N$, can be used as IMF scaling factors. In addition, $\hat{m}_{\max}$ is a characteristic maximum stellar mass in the cluster, but not the actual maximum stellar mass. A $\langle M \rangle$ - $\hat{m}_{\max}$ correlation is a natural result of a probabilistic interpretation of the IMF; however, the distribution of observational data in the $N$ (or $M$) - $m_{\max}$ plane includes a dependence on the distribution of the total number of stars, $N$ (and $M$), in the system, $\Phi_N(N)$, which is not usually taken into consideration.

Conclusions. We conclude that a random sampling IMF is not in contradiction to a possible $\hat{m}_{\max}$ - $M$ physical law. However, such a law cannot be obtained from IMF algebraic manipulation or included analytically in the IMF functional form. The possible physical information that would be obtained from the $N$ (or $M$) - $m_{\max}$ correlation is closely linked with the $\Phi_M(M)$ and $\Phi_N(N)$ distributions; hence it depends on the star formation process and the assumed definition of stellar cluster.

Key words. stars: statistics - stars: formation - galaxies: stellar content - methods: data analysis
1. Introduction

In recent literature, the term initial mass function (IMF) is used to indicate three different types of distributions: (1) the distribution by number of the stellar masses observed in a particular star ensemble, (2) a normalized version of (1), i.e., the frequency distribution of the stellar masses observed in a particular star ensemble, and (3) the theoretical probability density function $\phi(m)$ of the stellar masses that can be formed in a generic star ensemble. In this work, following Scalo (1986), we adopt the third definition and explore some consequences of mixing these definitions.

In the following, we leave distribution (2) out of the discussion and focus, for simplicity, only on distributions (1) and (3)^1. These two distributions are different but closely related to each other, as statistics and probability are. Probability deals with predicting the likelihood of possible events in a system with known properties; statistics consists in analysing the distribution of real events with the aim of determining some unknown property of the system. Probability addresses the direct problem, while statistics addresses the inverse problem. In our case, distribution (3) describes the underlying probability distribution from which stellar masses can be drawn, while distribution (1) describes an actual stellar sample from which we aim, ideally, to recover the parameters of the underlying probability distribution.

The relation between the shape of (1) and the shape of (3) depends crucially on the size of the sample, that is, the number of stars $N$; when $N$ values are large, the two shapes tend to be similar. This similarity can mislead one into believing that (1) is just a scaled-up version of (3), with $N$ being the scale factor. This would be very wrong since, as explained above, the physical meanings of both distributions are intrinsically different. This paper is dedicated to exploring the implications of such difference.

Send offprint requests to: M. Cerviño, e-mail: mcs@iaa.es

^1 However, because distribution (2) is a scaled version of distribution (1), the conclusions derived from (1) also apply to (2).

Cerviño et al.: The IMF and the $m_{\max}$ - $M$ statistical correlation

A major drawback of the distribution-by-number view (num- 
ber (1) above) is that the very definition of a stellar sample nec- 
essarily implies some (hidden or explicit) assumption on the star 
formation (SF) process that originated the sample. For example, 
an embedded, open, or globular cluster, an OB association, and
so on, are coeval and cospatial samples; field stars, which are 
used to study galaxy structure, are neither coeval nor cospatial; 
the stars in a galaxy that were born at a given time, which are 
a sample suitable for stellar populations studies, are coeval but 
not cospatial. These examples make clear that, when a sample 
is selected, some predefined spatial and time scales are implic- 
itly assumed, and these scales may influence the distribution by 
number of the stellar masses. Rephrasing Scalo (1986), when
talking about the IMF, we are left in the uncomfortable position 
of having no means to define an empirical sample that corre- 
sponds to a consistent definition of IMF and that can be directly 
related to the theories of SF without introducing major assump- 
tions. 

The probability distribution function (pdf) view (number (3) 
above) is actually an abstraction used to describe the general uni- 
verse of initial masses that a star would have. This interpretation 
implies that we have to use a probability framework in order 
to make a description of the problem and inferences from ob- 
served data sets. One implicit requirement of such an approach 
is that the stellar mass is an independent and identically distributed
(iid) variable, and therefore, any realization of the IMF is a random
sample^2. Within this framework, all the empirical samples
are included naturally as far as they are particular realizations 
of the theoretical distribution. Although it is possible to include 
conditions representing particular SF scenarios, it is generally 
assumed that the IMF has no memory of the SF event: that is, 
the SF details have no major impact on the IMF itself, although 
they can have an impact on the resulting IMF realization once 
the corresponding conditions are included in the derivation. It 
is a surprising fact that there is no clear observational evidence
that the IMF varies strongly and systematically as a function of
different SF scenarios (Bastian et al. 2010).

Throughout this paper, we consider several pieces of work 
based on a distribution-by-number interpretation of the IMF. The
specific way in which the IMF is represented varies depending
on the paper considered. Some authors assume that the IMF is a
continuous law that returns, for each mass value, the number of 
stars of that mass; others consider that it returns the number of 
stars in each mass bin. Some assume that the stars are distributed 
in a predefined way and the mass of a star depends on the mass 
of the other stars; others consider that the stars are distributed 
independently from each other. In the following, we give exam- 
ples of this and emphasize the differences between the various 
distribution-by-number interpretations and the pdf view of the 
IMF.

Naturally, the equations involving the IMF depend on the in- 
terpretation of the IMF. More importantly however, the cluster- 
related quantities inferred from manipulations of the IMF are 
interpreted differently according to the initial assumptions. One 
case in which the different views of the IMF lead to dramatically 
diverging interpretations is the modeling of the correlation between
the total stellar mass in a cluster, $M$, and the mass, $m_{\max}$,
of its most massive star, which we investigate in this series of
papers.

There are many facets to the study of the $M$ - $m_{\max}$ correlation.
One is the correlation obtained theoretically from manipulations of the
IMF functional form, which is the subject of this paper. Another is the
inference of $M$ from partial information of the system. The lack of
information makes this inference deeply dependent on the IMF
interpretation (this aspect is discussed in Cerviño et al. 2011,
hereafter Paper II). A third issue is the comparison between theory and
observational data. This point also depends on the interpretation of the
IMF (and is studied in Jimenez-Donaire et al. 2013, in prep., from now on
Paper III).

The structure of the paper is as follows: In Sect. 2 we present
our basic framework for a probabilistic interpretation of the IMF.
Section 3 is devoted to analyzing in a probabilistic context the
meaning of the basic equation commonly used in the literature
relating $M$ and $m_{\max}$. In Sect. 4 we discuss the different
methodologies and assumptions used by other authors to obtain
an $M$ - $m_{\max}$ correlation. We include a discussion on iid stellar
masses and on the connection of the IMF with the SF. Finally,
we briefly discuss the composition of different IMFs to obtain an
integrated galaxy IMF (IGIMF). Our conclusions are described
in Sect. 5.



2. Formal probabilistic formulation 

Let us start by framing the problem in a formal probabilistic 
framework: 

1. The IMF, $\phi(m) = dN/dm$, is a pdf that provides the probability
of finding a star in a given mass range by its integration over that
mass range. The mass limits of the pdf, $m_{\mathrm{low}}$ and
$m_{\mathrm{up}}$, are given by stellar theory and must fulfill
$\int_{m_{\mathrm{low}}}^{m_{\mathrm{up}}} \phi(m)\,dm = 1$; that is, we
are certain that any possible star has a mass between $m_{\mathrm{low}}$
and $m_{\mathrm{up}}$. This is the first fundamental difference with
respect to the distribution-by-number interpretation: the IMF cannot be
arbitrarily normalized to $M$ or $N$, since it does not provide numbers
of stars with a given mass but the probability for a star to be born
with a given mass independently of how many stars are in the cluster or
the cluster total mass. In this interpretation of the IMF, there is
neither an implicit sample nor predefined space or time scales.
The IMF so defined may have values larger than one, provided its
integral over any mass range does not exceed one. This is the second
fundamental difference with respect to the distribution-by-number
interpretation when described in terms of frequencies (case 2 in the
Introduction), where no value larger than one is possible by
construction.
In this paper we use the Kroupa IMF (Kroupa 2001, 2002) as used in
Weidner & Kroupa (2006)^3 and subsequent works, except for the value of
$m_{\mathrm{up}}$, which we set equal to $120\,M_\odot$. Although a
larger value would probably be more realistic according to recent
studies (Crowther et al. 2010; see also the contributions to the UP2010
conference published by Treyer et al. 2011), this choice is motivated by
the fact that the $m_{\mathrm{up}}$ value of most public stellar tracks
used in most $m_{\max}$ estimations is $120\,M_\odot$. In Fig. 1 we show
the $\phi(m)$ used in this paper and the probability for a star of
having a mass in the range $[m, m + 1\,M_\odot)$.



^2 Random sample means that every possible sample has a calculable
chance of selection. This is a requirement of any statistical and
probabilistic study (Kendall & Stuart 1977).

^3 We note that Weidner & Kroupa (2004) use $\alpha_2 = 2.30$ in their
parametrization of the IMF and that Weidner & Kroupa (2006) use
$\alpha_2 = 2.35$.
Fig. 1. IMF used in the present work (solid line), as in the
parametrization by Kroupa (2001, 2002) and Weidner & Kroupa
(2006). Being a pdf, it can have values larger than one; the
probabilities are given by the integral over the pdf. We also plot the
probability that a star has a mass in the $[m, m + 1\,M_\odot)$ range,
which is lower than one (dashed line). This probability declines rapidly
when $m$ is larger than $m_{\mathrm{up}} - 1\,M_\odot$.



The probability for a random star of having a mass lower
than a given value $m_a$ is given by

$$p(m < m_a) = \int_{m_{\mathrm{low}}}^{m_a} \phi(m)\,dm, \tag{1}$$

while the probability for a random star of having a mass
equal to or larger than $m_a$ is given by

$$p(m \ge m_a) = \int_{m_a}^{m_{\mathrm{up}}} \phi(m)\,dm. \tag{2}$$

In this work, the integrals over the IMF will always be read 
as equal to or larger than the lower limit and lower than 
the upper limit. The use of lower than instead of equal to or 
lower than in the upper limit and the complementary in the 
lower limit is just a convention. However, equal cannot be 
used simultaneously in both equations: no star can simulta- 
neously belong to two independent intervals. The convention 
we use implies that the nominal value mup cannot be formally 
reached, although values very close to it are possible. 
2. Different observational scenarios can be described by adding
constraints to the IMF. For instance, we may explicitly include the
limit imposed on $m_{\max}$ by the total mass of the sample we are
analyzing, that is, $m_{\max} = \min\{m_{\mathrm{up}}, M\}$. In this
case, we must define an a posteriori pdf, related to the IMF, that
includes such a condition:

$$\phi(m \mid m < m_{\max}) = \frac{\phi(m)\,H(m_{\max} - m)}{p(m < m_{\max})}, \tag{3}$$

where $H(m_{\max} - m)$ is the Heaviside function^4, which ensures
that no star with mass equal to or larger than $m_{\max}$ can be present
in the cluster. We note that $\phi(m \mid m < m_{\max})$ is also a pdf.
The mean mass of such a distribution is

$$\langle m \mid m < m_{\max} \rangle = \frac{\int_{m_{\mathrm{low}}}^{m_{\max}} m\,\phi(m)\,dm}{p(m < m_{\max})}. \tag{4}$$

More elaborate constrained IMFs can be formulated, always
keeping in mind that conditions are imposed ad hoc and produce
a pdf whose functional form differs from $\phi(m)$.
The pdf describing ensembles with a total number of stars
$N$ (formally conditioned to have $N$ stars) can be calculated
as successive convolutions of the corresponding pdf for one
star. For instance, the pdf for the total mass, $\Phi_M(M|N)$, is
the result of convolving the IMF $N$ times with itself (see
Cerviño & Luridiana 2006; Selman & Melnick 2008):

$$\Phi_M(M|N) = \underbrace{\phi(m) \otimes \phi(m) \otimes \cdots \otimes \phi(m)}_{N}. \tag{5}$$

A property of self-convolution is that simple relations link
the mean value and the high-order moments of $\phi(m)$ and
$\Phi_M(M|N)$ (see, e.g., Cerviño & Luridiana 2006). As an example,
the mean integrated mass of $\Phi_M(M|N)$, $\langle M|N \rangle$, is
related to the mean stellar mass of the IMF, $\langle m \rangle$,
through the relation

$$\langle M|N \rangle = N \times \langle m \rangle = N \times \int_{m_{\mathrm{low}}}^{m_{\mathrm{up}}} m\,\phi(m)\,dm. \tag{6}$$

However, we note that $\Phi_M(M|N) \neq N \times \phi(m)$ and that
the actual total mass cannot be obtained, but only an estimate of it.
This is the third fundamental difference with the
distribution-by-number interpretation, which assumes that
for a given $N$ there is one, and only one, $M$ value, given
by $M(N) = N \times \langle m \rangle$.
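
The three differences listed above can be checked numerically. The following is a minimal sketch and not the parametrization used in this paper: it assumes a toy single power-law IMF with slope 2.35 between 0.1 and 120 solar masses, and the constants `ALPHA`, `M_LOW`, `M_UP` and all function names are hypothetical choices made for brevity. It verifies that the pdf can exceed one while its integral equals one, and that realizations with the same N give a spread of total masses around N times the mean stellar mass instead of a single value.

```python
import random

# Assumption: toy single power-law IMF, phi(m) ~ m**-ALPHA on [M_LOW, M_UP).
# This is NOT the Kroupa IMF adopted in the paper; it only keeps the algebra short.
ALPHA, M_LOW, M_UP = 2.35, 0.1, 120.0
_A = 1.0 - ALPHA
_K = (ALPHA - 1.0) / (M_LOW ** _A - M_UP ** _A)  # normalization constant


def phi(m):
    """pdf value at mass m (zero outside the mass limits)."""
    return _K * m ** (-ALPHA) if M_LOW <= m < M_UP else 0.0


def p_less(m_a):
    """p(m < m_a), the analogue of Eq. (1) for the toy IMF."""
    m_a = min(max(m_a, M_LOW), M_UP)
    return _K * (M_LOW ** _A - m_a ** _A) / (ALPHA - 1.0)


def mean_mass():
    """<m>, the integral of m * phi(m) over the mass limits (analytic)."""
    e = 2.0 - ALPHA
    return _K * (M_UP ** e - M_LOW ** e) / e


def draw_mass(rng):
    """Draw one stellar mass by inverse-transform sampling of the cdf."""
    u = rng.random()
    return (M_LOW ** _A - u * (M_LOW ** _A - M_UP ** _A)) ** (1.0 / _A)


def total_masses(n_stars, n_clusters, seed=0):
    """Total mass M of n_clusters independent realizations with n_stars each."""
    rng = random.Random(seed)
    return [sum(draw_mass(rng) for _ in range(n_stars))
            for _ in range(n_clusters)]


if __name__ == "__main__":
    n_stars = 100
    masses = total_masses(n_stars, 2000)
    print("phi(m_low) =", phi(M_LOW), "(a pdf value, not a probability)")
    print("N <m>      =", n_stars * mean_mass())
    print("mean of M  =", sum(masses) / len(masses))
    print("range of M =", min(masses), max(masses))
```

The spread printed in the last line is the point of the exercise: for fixed N the total mass is a random variable described by a distribution, and N times the mean stellar mass is only its mean.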

3. Relating the number of stars with the most 
massive star in the sample 

According to the law of large numbers, in a sample of $N$ stars
drawn from an underlying pdf, $\phi(m)$, the typical number of stars
$N_a$ with $m \ge m_a$ is given by $N_a = N \times p(m \ge m_a)$.
Particularizing this equation, we can define a characteristic maximum
value of $m_{\max}$, $\hat{m}_{\max}$, for which there is typically only
one star with mass equal to or larger than $\hat{m}_{\max}$ through

$$1 = N \times p(m \ge \hat{m}_{\max}) = N \times \int_{\hat{m}_{\max}}^{m_{\mathrm{up}}} \phi(m)\,dm. \tag{7}$$



This is the basic equation used by several authors as the determination
of the actual mass of the most massive star in a system (as examples:
Elmegreen 1997, 1999, 2000; Kroupa & Weidner 2003; Weidner & Kroupa
2004, 2006). However, we can also obtain a mean value of $m_{\max}$
(Oey & Clarke 2005) or a median value of $m_{\max}$ (Weidner et al.
2010). So the question is: does the definition of the characteristic
value $\hat{m}_{\max}$ indeed provide the actual $m_{\max}$ extreme
value or only an estimate of it? And if it is an estimate, what is its
exact meaning? Let us seek the answer in a probabilistic context^5.

^4 We use here the Heaviside function as a distribution to define the
domain of $\phi(m)$, including constraints. In this situation the value
of $H(0)$ is not defined, but it is assigned a posteriori to be
consistent with the convention used in the integral limits. In the case
of Eq. 3, $H(0) = 0$.

Fig. 2. Distribution of the maximum stellar mass,
$\Phi_{m_{\max}}(m_{\max}|N)$, for different values of $N$. The circle
on each curve is the position of the characteristic value
$\hat{m}_{\max}$.
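
Equation (7) can be solved numerically for any IMF whose survival function p(m >= m_a) is available. The sketch below is illustrative only: it assumes a toy single power-law IMF (slope 2.35 between 0.1 and 120 solar masses) rather than the Kroupa IMF adopted in this paper, and the names `p_geq` and `characteristic_mmax` are hypothetical. It finds the characteristic mass by bisection in log m, using only the fact that p(m >= m_a) decreases monotonically with m_a.

```python
import math

# Assumption: toy single power-law IMF on [M_LOW, M_UP); not the paper's IMF.
ALPHA, M_LOW, M_UP = 2.35, 0.1, 120.0
_A = 1.0 - ALPHA


def p_geq(m_a):
    """p(m >= m_a), the analogue of Eq. (2) for the toy IMF."""
    m_a = min(max(m_a, M_LOW), M_UP)
    return (m_a ** _A - M_UP ** _A) / (M_LOW ** _A - M_UP ** _A)


def characteristic_mmax(n_stars, tol=1e-10):
    """Solve N * p(m >= x) = 1 (Eq. 7) for x by bisection in log m."""
    if n_stars < 1:
        raise ValueError("n_stars must be at least 1")
    lo, hi = math.log(M_LOW), math.log(M_UP)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if n_stars * p_geq(math.exp(mid)) > 1.0:
            lo = mid   # more than one star expected above: the root is higher
        else:
            hi = mid   # at most one star expected above: the root is lower
    return math.exp(0.5 * (lo + hi))


if __name__ == "__main__":
    for n in (10, 100, 1000, 10000, 100000):
        print(n, characteristic_mmax(n))
```

The returned value grows with N and approaches, but never reaches, the upper mass limit, in line with the integration convention adopted in Sect. 2.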

We consider a set of $N$ stars with unknown stellar masses,
$m_i$, drawn from the IMF. For any given mass $m_a$, the probability
of having at least one star with mass $m_i$ equal to or larger than
$m_a$ in the sample, $\mathcal{P}(\exists i \in [1,N] \mid m_i \ge m_a)$,
is the complementary probability that all stars have a mass lower than
$m_a$, $\mathcal{P}(m_i < m_a, \forall i \in [1,N])$. Since the stellar
masses are iid drawn from the same distribution $\phi(m)$, the
probability $\mathcal{P}(m_i < m_a, \forall i \in [1,N])$ is the result
of multiplying $p(m < m_a)$ by itself $N$ times^6:

$$\mathcal{P}(m_i < m_a, \forall i \in [1,N]) = [p(m < m_a)]^N = [1 - p(m \ge m_a)]^N. \tag{8}$$

Thus,

$$\mathcal{P}(\exists i \in [1,N] \mid m_i \ge m_a) = 1 - \mathcal{P}(m_i < m_a, \forall i \in [1,N]) = 1 - [1 - p(m \ge m_a)]^N. \tag{9}$$

This relation is valid for any value of $m_a$ and any distribution
function.

Fig. 3. Percentile analysis around the median of
$\Phi_{m_{\max}}(m_{\max}|N)$ as a function of $N$ (shaded areas). The
figure includes as a reference the position of the characteristic
value, median, mean, and mode of the distribution. Small triangles:
compilation by Weidner et al. (2010) of observational values of
$m_{\max}$ and inferred values of $N$ obtained from observations;
squares: observed values of $N$ and $m_{\max}$ from Kirk & Myers (2011);
stars: observed values of $N$ and $m_{\max}$ in the field for the four
observed regions from Kirk & Myers (2011).

If we now set $m_a = \hat{m}_{\max}$, we can replace
$p(m \ge \hat{m}_{\max})$ in Eq. 9 by $1/N$ by virtue of the
$\hat{m}_{\max}$ definition. The probability that there is at least one
star with $m \ge \hat{m}_{\max}$ in a sample of $N$ stars is thus given
by

$$\mathcal{P}(\exists i \in [1,N] \mid m_i \ge \hat{m}_{\max}) = 1 - \left[1 - \frac{1}{N}\right]^N, \tag{10}$$

which has an asymptotic value $1 - 1/e \approx 0.63$ for large $N$
values, with 0.63 being a reasonable approximation for, say, $N > 100$.
Hence, the characteristic mass, $\hat{m}_{\max}$, obtained by solving
Eq. 7 is the value of $m$ that is not reached or exceeded^7 with a
probability 0.37 in a sample of $N$ stars. This means that in a large
enough set of clusters, all of them with $N$ stars, typically in 63%
of the clusters the mass of the most massive star will be equal to
or larger than $\hat{m}_{\max}$, while in 37% of the clusters it will be
lower than $\hat{m}_{\max}$. So the $\hat{m}_{\max}$ value obtained in
Eq. 7 does not provide the mass $m_{\max}$ of the most massive star in a
cluster of $N$ stars, contrary to what is stated in several
astrophysical papers^8.

^5 The discussion in this section is mainly based on Sornette (2004),
Kendall & Stuart (1977), and Gumbel (1958), although the same formulae
can be found in other works.

^6 Here we use $p$ to represent probabilities on the IMF (cf. Eqs. 1 and
2) and $\mathcal{P}$ to represent probabilities on the sample with $N$
stars.



Actually, for any possible value $m_{\max}$ lower than $m_{\mathrm{up}}$
that we would use as a proxy of the actual value of $m_{\max}$, there is
a probability larger than 90% that the most massive star in the
system is more massive than such $m_{\max}$ value (see Appendix A
for details).
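
The 63%/37% split of Eq. (10) does not depend on the shape of the IMF, which makes it easy to check. The sketch below (function names are hypothetical) evaluates Eq. (10) directly and also runs a small Monte Carlo experiment in cdf space: a star reaches or exceeds the characteristic mass exactly when its cumulative probability is at least 1 - 1/N, so uniform random numbers are enough and no particular IMF has to be assumed.

```python
import math
import random


def prob_at_least_one(n_stars):
    """Eq. (10): probability that at least one of N stars reaches the
    characteristic mass, for which p(m >= m_hat) = 1/N."""
    return 1.0 - (1.0 - 1.0 / n_stars) ** n_stars


def simulated_fraction(n_stars, n_clusters=5000, seed=1):
    """Fraction of simulated clusters whose most massive star reaches the
    characteristic mass. Works in cdf space, so it is IMF independent."""
    rng = random.Random(seed)
    threshold = 1.0 - 1.0 / n_stars  # cdf value at the characteristic mass
    hits = sum(
        1 for _ in range(n_clusters)
        if max(rng.random() for _ in range(n_stars)) >= threshold
    )
    return hits / n_clusters


if __name__ == "__main__":
    for n in (1, 2, 10, 100, 10000):
        print(n, prob_at_least_one(n))
    print("asymptote 1 - 1/e =", 1.0 - math.exp(-1.0))
    print("Monte Carlo, N = 100:", simulated_fraction(100))
```

For N = 1 the probability is one by construction (the single star is the most massive star), and it decreases toward 1 - 1/e as N grows.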
Fig. 4. Confidence interval analysis of $\Phi_{m_{\max}}(m_{\max}|N)$
as a function of $N$ (shaded area). Lines and symbols have the same
meaning as in Fig. 3.



3.1. The pdf of $m_{\max}$ for a known $N$, $\Phi_{m_{\max}}(m_{\max}|N)$

Actually, there is no unique value of $m_{\max}$ for a total number of
stars $N$, but the possible values of $m_{\max}$ are distributed
following the probability function

$$\Phi_{m_{\max}}(m_{\max}|N) = N\,\phi(m_{\max}) \left[\int_{m_{\mathrm{low}}}^{m_{\max}} \phi(m)\,dm\right]^{N-1} \tag{11}$$

$$\phantom{\Phi_{m_{\max}}(m_{\max}|N)} = N\,\phi(m_{\max})\,p(m < m_{\max})^{N-1}, \tag{12}$$

as deduced by Gumbel (1958), Sornette (2004), van Albada (1968),
Oey & Clarke (2005), Maschberger & Clarke (2008), and
Pflamm-Altenburg & Kroupa (2008), among others.

In Fig. 2 we show the distribution $\Phi_{m_{\max}}(m_{\max}|N)$ for
different values of $N$. The circle on each pdf corresponds to the
position of the characteristic value $\hat{m}_{\max}$, which divides the
pdf in two areas: the left one containing 37% of the probability
and the right one containing 63% of the probability. We note
that $\Phi_{m_{\max}}(m_{\max}|N)$ is highly asymmetrical. Given the
shape of the distribution, it cannot be described only by its parameters
(mean, variance, and so on); we must consider the whole distribution
for any comparison with the observational data. This can be done in two
ways: by a percentile analysis (analysis around the median) and by a
confidence interval analysis around the mode (the maximum value of the
distribution, which is related to the most common value obtained in a
set of observations).

^7 We note that, depending on the reference and the convention used
in Sect. 2, this value can be defined either as reached or exceeded or
just as exceeded.

^8 The characteristic largest value defined by Eq. 7 is related to the
estimation of the number of events we must record to have an event
larger than a given value $m_a$ (which is called return period in
extreme value theory). If the events are taken in a regular time
interval, for instance, it could be the estimation of the number of
years between earthquakes larger than a given magnitude, the number of
years between economic crashes, and so on.

Figure 3 shows a percentile analysis of the distribution. The
figure also includes the position of the mean, mode, and characteristic
values of the distribution for reference. The position of the mean,
$\langle m_{\max}|N \rangle$, mostly falls between the 63% and 84%
percentile, i.e., far from the median of the distribution. On the other
hand, $\hat{m}_{\max}$ corresponds, as predicted, to the 37% percentile.
Finally, the mode of the distribution lies in the lowest percentile
range. The figure also shows the ($m_{\max}$, $N$) values compiled by
Weidner et al. (2010), in which $m_{\max}$ is determined from
observations and $N$ is inferred from star counting in a given mass
range^10. It also shows the data from Kirk & Myers (2011), who
quote the observed masses of individual stars of 14 young stellar
groups in four different regions ($m_{\max}$, $N$, and $M$ were
obtained from their tabulated data). We also show the corresponding
$m_{\max}$ and $N$ values of field stars in each region analyzed by
Kirk & Myers (2011), which are in agreement with the general
trend of the correlation.
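
A percentile analysis of this kind follows directly from the cumulative form of Eq. (11): the probability that the most massive of N stars lies below x is p(m < x) raised to the power N, so the q-th quantile of the maximum is the mass at which p(m < x) equals q^(1/N). The sketch below is illustrative only and assumes a toy single power-law IMF (slope 2.35, 0.1 to 120 solar masses) instead of the Kroupa IMF used for Fig. 3; all names are hypothetical.

```python
# Assumption: toy single power-law IMF on [M_LOW, M_UP); not the paper's IMF.
ALPHA, M_LOW, M_UP = 2.35, 0.1, 120.0
_A = 1.0 - ALPHA
_SPAN = M_LOW ** _A - M_UP ** _A


def p_less(m_a):
    """p(m < m_a) for the toy IMF (Eq. 1)."""
    m_a = min(max(m_a, M_LOW), M_UP)
    return (M_LOW ** _A - m_a ** _A) / _SPAN


def inv_cdf(u):
    """Mass x such that p(m < x) = u, for 0 <= u <= 1."""
    return (M_LOW ** _A - u * _SPAN) ** (1.0 / _A)


def mmax_cdf(x, n_stars):
    """Probability that the most massive of n_stars stars is below x."""
    return p_less(x) ** n_stars


def mmax_quantile(q, n_stars):
    """q-th quantile of the distribution of the maximum stellar mass."""
    return inv_cdf(q ** (1.0 / n_stars))


def characteristic_mmax(n_stars):
    """Characteristic value from Eq. (7): p(m >= x) = 1/N."""
    return inv_cdf(1.0 - 1.0 / n_stars)


if __name__ == "__main__":
    n = 1000
    x_hat = characteristic_mmax(n)
    print("characteristic value:", x_hat)
    print("its percentile      :", mmax_cdf(x_hat, n))
    for q in (0.16, 0.50, 0.84):
        print("quantile", q, "->", mmax_quantile(q, n))
```

The percentile of the characteristic value equals (1 - 1/N)^N, which is the 37% level quoted in the text for large N, so the median of the distribution always lies above the characteristic value.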

The confidence interval around the mode analysis takes into
account the distribution shape and the range of probability of any
region in the diagram. This is done by sorting the contributions
to the probability in decreasing order and finding the $m_{\max}$ range
that contains some specified amount of probability. Different
confidence intervals are obtained by adding the sorted probabilities,
taking into account their associated $m_{\max}$ values. This
methodology is extensively used in the analysis of redshifts in
photometric surveys (see Fernández-Soto et al. 2002 for more
details). The situation is illustrated in Fig. 4, which includes the
90, 68, and 26% confidence intervals.
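
The sorting procedure just described can be sketched on a grid. The example below is a minimal illustration under stated assumptions, not the code used for Fig. 4: it assumes a toy single power-law IMF (slope 2.35, 0.1 to 120 solar masses), evaluates Eq. (12) on a uniform grid in mass, sorts the cells by decreasing probability, and accumulates them until the requested level is reached. The function name `mode_interval` and the grid size are hypothetical choices.

```python
# Assumption: toy single power-law IMF on [M_LOW, M_UP); not the paper's IMF.
ALPHA, M_LOW, M_UP = 2.35, 0.1, 120.0
_A = 1.0 - ALPHA
_SPAN = M_LOW ** _A - M_UP ** _A


def phi(m):
    """Toy IMF pdf, valid for M_LOW <= m < M_UP."""
    return (ALPHA - 1.0) / _SPAN * m ** (-ALPHA)


def p_less(m_a):
    """p(m < m_a) for the toy IMF (Eq. 1)."""
    return (M_LOW ** _A - m_a ** _A) / _SPAN


def mmax_pdf(x, n_stars):
    """Eq. (12): pdf of the maximum stellar mass for a known N."""
    return n_stars * phi(x) * p_less(x) ** (n_stars - 1)


def mode_interval(n_stars, level, n_grid=20000):
    """Range of m_max around the mode containing `level` of the probability.

    Cells of a uniform grid are sorted by decreasing probability and added
    one by one until the accumulated probability reaches `level`.
    Returns (lower edge, upper edge, grid estimate of the mode).
    """
    dm = (M_UP - M_LOW) / n_grid
    centers = [M_LOW + (i + 0.5) * dm for i in range(n_grid)]
    probs = [mmax_pdf(x, n_stars) * dm for x in centers]
    total = sum(probs)  # renormalize away the discretization error
    order = sorted(range(n_grid), key=lambda i: probs[i], reverse=True)
    acc, i_min, i_max = 0.0, order[0], order[0]
    for i in order:
        i_min, i_max = min(i_min, i), max(i_max, i)
        acc += probs[i] / total
        if acc >= level:
            break
    return centers[i_min] - 0.5 * dm, centers[i_max] + 0.5 * dm, centers[order[0]]


if __name__ == "__main__":
    for level in (0.26, 0.68, 0.90):
        print(level, mode_interval(1000, level))
```

Because the cells are added in order of decreasing probability, intervals for increasing levels are nested and always contain the mode; for a strongly asymmetric pdf they are not centered on the median, which is why this analysis and the percentile analysis give different ranges.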



3.2. The pdf of $N$ for a known $m_{\max}$, $\Phi_N(N|m_{\max})$

In Sect. 3.1 we discussed the estimation of $m_{\max}$, given the
number of stars $N$. Alternatively, we can also investigate the opposite
case, the estimation of $N$ from a known $m_{\max}$ (that is, the
determination of the $\Phi_N(N|m_{\max})$ distribution). To address this
problem, we can use Bayes' theorem:

$$\Phi_N(N|m_{\max}) = \frac{\Phi_{m_{\max}}(m_{\max}|N)\,\Phi_N(N)}{\int \Phi_{m_{\max}}(m_{\max}|N)\,\Phi_N(N)\,dN}. \tag{13}$$

We know all terms on the right-hand side of this equation,
except $\Phi_N(N)$, which is the probability of having a system with
a given total number of stars, i.e., an initial
number-of-stars-per-cluster function (an initial cluster number
function, ICNF). If $\Phi_N(N)$ is a power-law distribution in a similar
fashion to the initial cluster mass function (ICMF),
$\Phi_N(N) = A\,N^{-\beta}$ with $A$ a normalization value, we find

$$\Phi_N(N|m_{\max}) = A'\,p(m < m_{\max})^{N-1}\,N^{1-\beta}, \tag{14}$$

where $A'$ is a normalization value that includes $A$.



^9 The analyses based on the parameters of the distribution, on the
percentile, and on confidence intervals around the mode are equivalent
only in the Gaussian case, where $1\sigma$ is almost equivalent to the
percentile range 16 - 84% and the 68% confidence interval.

^10 Except in a few cases, Weidner & Kroupa (2004) and Weidner et al.
(2010) obtain $N$ by extrapolating to the full IMF range the number of
stars observed above a specified mass or within a specified mass
range. Then, $M$ is obtained by means of $M = N \times \langle m \rangle$.
We obtained the plotted $N$ values by division of the $M$ values quoted
in their tables by $\langle m \rangle$.
Fig. 5. Confidence interval analysis of $\Phi_N(N|m_{\max})$ as a
function of $m_{\max}$ for a $\Phi_N(N) = $ constant. The shaded areas
contain 26%, 68%, and 90% of the probability. Symbols have the same
meaning as in Fig. 3.



The mode of $\Phi_N(N|m_{\max})$, $N^{\mathrm{mode}}$, is obtained by
setting to zero its first derivative with respect to $N$, which
yields^11

$$N^{\mathrm{mode}} = -\frac{1 - \beta}{\ln p(m < m_{\max})}. \tag{15}$$

This equation has an acceptable solution only for $\beta < 1$;
in particular, for a flat distribution of $N$ (i.e., $\beta = 0$) the
result is approximately $1 / p(m \ge m_{\max})$. This justifies the name
of $\hat{m}_{\max}$ as the characteristic value, since it provides
$N^{\mathrm{mode}}$ as a function of the most extreme value of the
distribution under the hypothesis of a flat $\Phi_N(N)$^12. In Fig. 5 we
plot the confidence intervals of the $\Phi_N(N|m_{\max})$ distribution
as a function of $m_{\max}$. We note that the axes of the plot have
changed with respect to the figures in the previous section, since
$m_{\max}$ is now the variate. We also plot the data points from
Weidner et al. (2010) and Kirk & Myers (2011).

However, Eq. 15 results in a negative value without astrophysical
meaning if the ICNF is similar to the ICMF; $\Phi_N(N|m_{\max})$ is a
decreasing function for all $N$, and the most probable $N$ corresponds
to the maximum of $\Phi_N(N)$, i.e., the lower limit of the $\Phi_N(N)$
distribution. Hence, $\Phi_N(N)$ modifies the confidence interval
analysis of $\Phi_N(N|m_{\max})$, as shown in Fig. 6.
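
The behavior of the mode can be checked without specifying an IMF, because Eqs. (14) and (15) depend on the IMF only through the single number p(m < m_max). The sketch below takes that probability as an input (`p_below`, a hypothetical parameter name) and compares the closed-form mode with a brute-force search over integer N; the search range and the example value 0.999 are arbitrary assumptions.

```python
import math


def log_posterior(n, p_below, beta):
    """Logarithm of Eq. (14), up to the additive constant ln A'."""
    return (n - 1) * math.log(p_below) + (1.0 - beta) * math.log(n)


def mode_formula(p_below, beta):
    """Closed-form mode of Eq. (15), treating N as a continuous variable."""
    return -(1.0 - beta) / math.log(p_below)


def mode_numeric(p_below, beta, n_min=1, n_max=100000):
    """Integer N in [n_min, n_max] that maximizes Eq. (14)."""
    return max(range(n_min, n_max + 1),
               key=lambda n: log_posterior(n, p_below, beta))


if __name__ == "__main__":
    p_below = 0.999  # assumed value of p(m < m_max), for illustration only
    print("flat prior, formula :", mode_formula(p_below, 0.0))
    print("flat prior, numeric :", mode_numeric(p_below, 0.0))
    print("1 / p(m >= m_max)   :", 1.0 / (1.0 - p_below))
    print("beta = 2, formula   :", mode_formula(p_below, 2.0))
    print("beta = 2, numeric   :", mode_numeric(p_below, 2.0))
```

With a flat prior the two estimates agree to within one star and are close to 1 / p(m >= m_max). With beta = 2 the formula turns negative, while the brute-force search returns the lower limit of the allowed range, which is the situation described in the text for an ICNF similar to the ICMF.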

It seems surprising that, depending on the independent variable used
($m_{\max}$ or $N$), one has to take into account $\Phi_N(N)$.
Where is the $\Phi_N(N)$ dependence in Figs. 3 and 4? Actually, we
must be aware that Figs. 3, 4, 5, and 6 are not representations
of $\Phi_{m_{\max},N}(m_{\max}, N)$, which would be the one to be
compared with observational data. Instead, they are a representation of
the probability for fixed values in the x-axis, i.e., the figures can
be only interpreted making vertical (discrete or infinitesimal) slices.
Hence, for comparison with data, the x-axis on Figs. 3 and 4
must be weighted by $\Phi_N(N)$, and the x-axis on Figs. 5 and 6
must be weighted by $\phi(m)$. Obviously, such a weight process
changes the probability density in the $N$ - $m_{\max}$ plane.

^11 $N$ is not a continuous variable; hence it cannot be differentiated
and $N^{\mathrm{mode}}$ must be an integer number. Thus, the formulae
provide only an approximation.

^12 In Paper II we show that this assumption is implicit when $N$ is
inferred from the number $N_a$ of massive stars in the
($m_{\max}$, $m_a$) range by using the relation
$N_a = N \times p(m \ge m_a)$. Similarly, the assumption is implicit
when $M$ is inferred by multiplying the mean stellar mass by $N$; it is
a general assumption found in the literature and, in particular, is the
method used to infer $M$ in the Weidner et al. (2010) compilation.

Fig. 6. Confidence interval analysis of $\Phi_N(N|m_{\max})$ as a
function of $m_{\max}$ for a $\Phi_N(N) \propto N^{-2}$. Arrows: data
points by Weidner et al. (2010) using $N_{\mathrm{obs}}$ without
correction of incompleteness due to unobserved stars. Other symbols have
the same meaning as in Fig. 3.

3.3. Which information does the $N$ (or $M$) - $m_{\max}$ plane contain?

All the quantities considered here, $m_{\max}$, $N$, and $M$, have their
own distributions, $\phi(m)$, $\Phi_N(N)$, and $\Phi_M(M)$. So, any
uncertainty of data points in the $N$ (or $M$) - $m_{\max}$ plane would
be minimized or amplified by such distributions, and neither
$\Phi_{m_{\max}}(m_{\max}|N)$ nor $\Phi_N(N|m_{\max})$ (or their $M$
counterparts) are suitable descriptions. The only suitable distribution
of data points is given by $\Phi_{m_{\max},N}(m_{\max}, N)$^13 (or their
$M$ counterpart, see below). This pdf is shown in Fig. 7 for the case of
a $\Phi_N(N) \propto N^{-2}$. However, the use of
$\Phi_{m_{\max},N}(m_{\max}, N)$ imposes some important caveats.

The first of these caveats affects any test on the $N$ (or $M$) -
$m_{\max}$ correlation. Such a test can only be done at a distribution
level and not in a data-point-by-data-point analysis. This means
that we need a quantitative characterization of the uncertainty
associated to each data point and must combine the corresponding
uncertainties to obtain a density map in the $N$ (or $M$) - $m_{\max}$
plane.

^13 That is: $\Phi_{m_{\max},N}(m_{\max}, N) = \Phi_{m_{\max}}(m_{\max}|N)\,\Phi_N(N)$.

The second caveat refers to the plane to be used: $N$ - $m_{\max}$
or $M$ - $m_{\max}$? It includes two different aspects. The first is that
any $M$ inference implicitly includes an $N$ inference, and in most
of the cases (all where $\langle m \rangle$ is used), it is actually an
$N$ inference itself but expressed as $\langle M \rangle$ (i.e., the
plane to be used is actually $N$ - $m_{\max}$). The second aspect is that
the distribution of data points in the $N$ - $m_{\max}$ plane includes
$\phi(m)$ and $\Phi_N(N)$, and the distribution of data points in the
$M$ - $m_{\max}$ plane also includes $\Phi_M(M)$. This means that some
hypothesis about the relation between $N$ and $M$ is always required
when the $M$ - $m_{\max}$ plane is used.

We conclude this section with a brief discussion about the
falsification of the random sampling of the IMF claimed by
Weidner et al. (2010) in view of the results presented here, that
is, the dependence on $\Phi_N(N)$ and $\Phi_M(M)$ in the distribution of
data points in the $N$ (or $M$) - $m_{\max}$ plane.

First, random sampling is an axiom in statistics and proba- 
bility. It is not a hypothesis. Statistical tests evaluate the compat- 
ibility of a hypothetical distribution with a given sample. There 
can be two main reasons for the incompatibility of both entities: 
(a) the assumed distributions are not a correct representation of 
the sample, (b) the sample is biased or not randomly chosen. In 
the present case, the hypothesized distributions are the IMF, the 
ICNF and the ICMF, where the ICMF and the ICNF are linked 
non-trivially by Eq. 5. We could assume a universal IMF, but we would
still need an ICMF (or ICNF) characterization. The very definition of
the ICMF (or ICNF) leads to an uncomfortable situation similar 
to the case of the IMF: we have no means of defining an empiri- 
cal sample that can be directly related to SF theories without in- 
troducing a major assumption, that is, the cluster definition. Can 
a single star be considered as a valid cluster? How do we de- 
fine a single cluster formation event in a giant molecular cloud? 
Is there a difference between the ICMF defined over a random 
set of clusters and the one defined over a group of clusters that 
would have a common origin in a large-scale star-forming event?

Hence, the results obtained by .Weidner et al.l (12010.) can be 
interpreted in different ways: 

- The clusters in the sample do not follow the assumed IMF. 

- The clusters in the sample do not follow the assumptions
about the ICMF or ICNF.

- The sample is biased due to selection effects (including the 
definition of what a cluster is). 

- The sample is incomplete, so no conclusions about the pre- 
ceding items can be obtained. 



We will discuss these issues in more detail in Papers II and III.



4. Discussion 

In the previous sections we have established the formal
probabilistic interpretation of the IMF and the propagation of this
interpretation in the correlation between $m_\mathrm{max}$ and $\mathcal{N}$. We can
now explore the implications of such an interpretation and (a)
compare it with the implications of concurrent interpretations
(Sect. 4.1), and (b) discuss the random-sampling assumption of
this work and its implications for the relation between the IMF
and the SF (Sect. 4.2).




Fig. 7. 3D representation of the $\log \Phi_{m_\mathrm{max},\mathcal{N}}(m_\mathrm{max}, \mathcal{N})$ distribution
for a $\Phi_\mathcal{N}(\mathcal{N}) \propto$



4.1. Literature on the $\mathcal{M} - m_\mathrm{max}$ and the $\mathcal{N} - m_\mathrm{max}$ correlations



There are copious studies related to the existence and modeling
of an $\mathcal{M} - m_\mathrm{max}$ correlation (for instance, Reddish
1978; Larson 1982; Vanbeveren 1982; García-Vargas & Díaz
1994; García-Vargas et al. 1995; Elmegreen 1997, 1999, 2000;
Larson 2003; Kroupa & Weidner 2003; Weidner & Kroupa
2004; Oey & Clarke 2005; Weidner & Kroupa 2006;
Parker & Goodwin 2007; Selman & Melnick 2008;
Maschberger & Clarke 2008; Weidner et al. 2010; Kroupa et al.
2011). Some of these articles give an explicit formulation of this
relation, while others propose that it is a physical relation that
links both quantities. Others even argue that the relation is not
physical but only an effect of the size of samples. As we will
see, the difference among the various $\mathcal{M} - m_\mathrm{max}$ relationships
and their meaning does not depend on the relation itself, but
rather on how each author interprets the IMF.

One common assumption is that the $\mathcal{N} - m_\mathrm{max}$ and the
$\mathcal{M} - m_\mathrm{max}$ correlations are theoretically equivalent. With this idea
in mind, the first correlation is preferred by Selman & Melnick
(2008) and Maschberger & Clarke (2008), who argue that $\mathcal{N}$ is
the natural independent variable for testing the random-sampling
hypothesis. The second one is preferred by Weidner et al. (2010)
because, with the two quantities inferred, the possible error
in $\mathcal{N}$ is larger than the error in $\mathcal{M}$. Only a few authors
(Selman & Melnick 2008) explore the question of whether they
are indeed formally equivalent or not. As we have seen
previously, in a probabilistic framework they are not equivalent (cf.
Eq.Ell. 



4.1.1. The IMF as an exact analytical law



Let us consider the case of García-Vargas & Díaz (1994) and
García-Vargas et al. (1995) as an example of this interpretation.
They assume that the IMF is not a probability distribution but an
exact analytical law, $\phi_\mathrm{GV}(m) = k(\mathcal{M}) \times \phi(m)$, where $k(\mathcal{M})$ is a
Fig. 8. $\mathcal{M} - m_\mathrm{max}$ relationship resulting from the analytical
formulation of the IMF of García-Vargas & Díaz (1994) and
García-Vargas et al. (1995). The figure includes data points from
Weidner et al. (2010) and Kirk & Myers (2011), where symbols
have the same meaning as in Fig. 3, and the result of two linear
fits to the data from Weidner et al. (2010) and Kirk & Myers
(2011) using either $\log \mathcal{M}$ or $\log m_\mathrm{max}$ as the independent
variable.



We note that the relevant point here is that there must be a
certain amount of mass transformed into stars with mass $m < m_a$ in
order to have a star with mass $m_a$.

A similar $\mathcal{M}_\mathrm{cloud} - m_\mathrm{max}$ relationship is found by Larson
(1982, 2003). However, Larson's results come from fitting the
observational data of cloud masses, $\mathcal{M}_\mathrm{cloud}$, with respect to $m_\mathrm{max}$,
and they are quoted as a statistical correlation, not a physical
law. We note that a correlation between $\mathcal{M}_\mathrm{cloud}$ and $m_\mathrm{max}$ does
not imply the same correlation between $\mathcal{M}$ and $m_\mathrm{max}$, since an
efficiency factor is required (see Shadmehri & Elmegreen 2011
for a more detailed discussion).

In Fig. 8 we show the resulting $\mathcal{M} - m_\mathrm{max}$ relationship
under these assumptions on the IMF and assuming the functional
form of the IMF used in this work. The figure includes data
points from Weidner et al. (2010) and Kirk & Myers (2011).
We have included the result of two linear fits to the data from
Weidner et al. (2010) and Kirk & Myers (2011) using either
$\log \mathcal{M}$ or $\log m_\mathrm{max}$ as the independent variable. The theoretical
relation is offset toward larger $\log \mathcal{M}$ values.

This interpretation of the IMF stems from stellar counting
procedures. Since $\phi_\mathrm{GV}(m)$ is a continuous function, it cannot
return a natural number for any mass value $m_a$; because stars
are discrete entities, this approach can only be an approximate
description. This alone is sufficient to invalidate Eq. 17 as a way
to obtain the actual most massive star, since $\mathcal{N}$ may
(unphysically) turn out to be a non-natural number. As a consequence, this
equation can only provide an approximation.

This situation implies that continuous functional forms of the
IMF can only be directly related to the number of stars within a
given mass interval, and not to the number of stars with a given
mass. This possibility is explored in the next interpretation case.



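renormalization constant that, because $\mathcal{M}$ is the exact value of
the amount of gas transformed into stars, verifies

$$\mathcal{M} = \int_{m_\mathrm{low}}^{m_\mathrm{up}} m\, \phi_\mathrm{GV}(m)\, dm = k(\mathcal{M}) \int_{m_\mathrm{low}}^{m_\mathrm{up}} m\, \phi(m)\, dm, \tag{16}$$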



where $\phi(m)$ is the standard functional form of the IMF. The
exact number of stars with mass $m_a$ in the cluster is given by
$\mathcal{N}_a = \phi_\mathrm{GV}(m_a)$, which implies that $k(\mathcal{M}) = \mathcal{N}$. Taking into
account that stars are discrete entities, they propose a scenario
in which only the stellar masses that verify $\phi_\mathrm{GV}(m) \geq 1$
represent acceptable physical solutions (the so-called richness effect).
Given that $\phi_\mathrm{GV}(m)$ decreases with $m$, the most massive star in the
cluster is the one that verifies

$$\phi_\mathrm{GV}(m_\mathrm{max}) = \mathcal{N} \times \phi(m_\mathrm{max}) = 1. \tag{17}$$

For a power-law IMF, $\phi(m) = A\, m^{-\alpha}$, this leads to an $\mathcal{M} - m_\mathrm{max}$
relationship with the form:

$$m_\mathrm{max} \propto \mathcal{M}^{1/\alpha}. \tag{18}$$

According to the scenario proposed, the cluster forms stars
in a sorted way, in which the stars with an associated larger value
of $\phi_\mathrm{GV}(m)$ take precedence over stars with associated lower
values of $\phi_\mathrm{GV}(m)$. So, the most massive star (the one with the
lowest $\phi_\mathrm{GV}(m_\mathrm{max})$ value) is conditioned to the formation of a large
enough number of lower mass stars (the richness effect). Stated
otherwise, the mass of this most massive star is determined by
the amount of gas that remains after all possible lower mass stars
have been formed with relative numbers established by the IMF.



4.1.2. The IMF as a distribution of the number of stars

One alternative view of the IMF is that it can be
arbitrarily normalized and provide the exact number of stars
in a given mass range. This is the case assumed by
Reddish (1978); Vanbeveren (1982); Elmegreen (1997, 1999,
2000); Kroupa & Weidner (2003); Weidner & Kroupa (2004);
Elmegreen (2006); Weidner & Kroupa (2006); Weidner et al.
(2010) and Kroupa et al. (2011). We refer to these articles as
those that use the IMF de facto as a distribution of the number
of stars. Their interpretation is that the number of stars between
$m_a$ and $m_b$, with $m_a < m_b$, is given by

$$\mathcal{N}(m \in [m_a, m_b]) = \int_{m_a}^{m_b} \phi_\mathrm{Elm}(m)\, dm, \tag{19}$$



where $\phi_\mathrm{Elm}(m) = k \times \phi(m)$ with $k$ a normalization constant. This
equation is the general case of Eq. 7, that is, the definition of
$m_\mathrm{max}$, described above. The difference with the previous case is
that the total number of stars in the cluster is now given by

$$\mathcal{N} = \int_{m_\mathrm{low}}^{m_\mathrm{up}} \phi_\mathrm{Elm}(m)\, dm, \tag{20}$$



so, $k = \mathcal{N}$. The actual total mass is given by integration
of $m \times \phi_\mathrm{Elm}(m)$ within the same mass limits. However, how
the limits are written and what interpretation is given to them
varies according to the author. Here we use the formalization by
Elmegreen (1997, 1999, 2000, 2006):

$$\mathcal{M} = \int_{m_\mathrm{low}}^{m_\mathrm{up}} m\, \phi_\mathrm{Elm}(m)\, dm = \mathcal{N} \times \int_{m_\mathrm{low}}^{m_\mathrm{up}} m\, \phi(m)\, dm, \tag{21}$$
and postpone to the next subsubsection the discussion of the
special case of Weidner & Kroupa (2004, 2006), Weidner et al.
(2010), and Kroupa et al. (2011). Whatever the normalization is,
we need an additional assumption to obtain the actual maximum
stellar mass in the cluster from Eq. 19. We have to assume ad
hoc that the most massive star $m_\mathrm{max}$ is the result of solving Eq. 7
(i.e., that the characteristic $m_\mathrm{max}$ defined by Eq. 7 is the actual
maximum stellar mass). To do so, external arguments,
similar to the richness effect, are required.

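For a power-law IMF and $m_\mathrm{up} = \infty$, the $m_\mathrm{max} - \mathcal{M}$ correlation
is

$$m_\mathrm{max} \propto \mathcal{M}^{\frac{1}{\alpha - 1}} \propto \mathcal{N}^{\frac{1}{\alpha - 1}}. \tag{22}$$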



Elmegreen (1997, 1999, 2000) argues that, since the cluster is
filled through random sampling, the inferred $m_\mathrm{max}$ can only be
an estimate of the actual value. Only Vanbeveren (1982) states
that it is possible to obtain the actual $m_\mathrm{max}$ value.
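
As a worked illustration of the scaling in Eq. 22, the condition $\mathcal{N} \times \int_{m_\mathrm{max}}^{\infty} \phi(m)\, dm = 1$ can be solved in closed form for a power law. The sketch below is not taken from any of the cited works. It assumes a pdf $\phi(m) \propto m^{-\alpha}$ normalized above a lower limit; the values $\alpha = 2.35$ and $m_\mathrm{low} = 0.1$, and the function names, are illustrative assumptions only.

```python
def tail_probability(m, alpha=2.35, m_low=0.1):
    """P(mass > m) for a pdf phi(m) ~ m**(-alpha) on [m_low, inf).

    Assumed illustrative values: alpha = 2.35, m_low = 0.1 (requires alpha > 1).
    """
    if m <= m_low:
        return 1.0
    return (m / m_low) ** (1.0 - alpha)


def characteristic_mmax(n_stars, alpha=2.35, m_low=0.1):
    """Mass m_max that solves n_stars * P(mass > m_max) = 1."""
    return m_low * n_stars ** (1.0 / (alpha - 1.0))


if __name__ == "__main__":
    for n_stars in (100, 1000, 10000):
        m_max = characteristic_mmax(n_stars)
        # n_stars * tail_probability(m_max) equals 1 by construction.
        print(n_stars, m_max, n_stars * tail_probability(m_max))
```

The value returned is the characteristic $m_\mathrm{max}$ discussed in the text, that is, an estimate and not the mass of the most massive star in any particular cluster.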

In Fig. 9 we show the resulting $\mathcal{M} - m_\mathrm{max}$ correlation
under these assumptions using the functional form of the IMF
employed here. The curve is completely equivalent to the
$\langle\mathcal{M}\rangle - m_\mathrm{max}$ correlation obtained in the pdf case. The figure includes
data points from Weidner et al. (2010) and Kirk & Myers (2011)
just for comparison. We also included the result of a linear fit of
$\log \mathcal{M}$ as a function of $\log m_\mathrm{max}$ obtained from the data.

This interpretation of the IMF relies on stellar counting
followed by a binning process. It is by far the most common
interpretation and is assumed in a wide range of situations, from
IMF determinations to stellar population synthesis. Its main
feature is that Eq. 19 provides the actual number of stars and that
$\mathcal{M} = \mathcal{N} \times \langle m \rangle$ provides the actual total stellar mass in the cluster
(this last feature is also shared by the analytical law
interpretation). In this case it may seem that the problem with integer
numbers of stars mentioned in the previous case is solved, insofar as we
can always choose a suitable set of bins such that Eq. 19 produces
a natural number for any $m_a$ and $m_b$ values. However, the
solution is not so trivial: depending on the bin definition,
distributions with different shapes are obtained (D'Agostino & Stephens
1986; Maíz Apellániz & Úbeda 2005), but the shape of the IMF
is still defined by $\mathcal{N} \times \phi(m)$. Consequently, the bins cannot be
defined at will. The only plausible solution is to assume that Eq. 19
(and hence Eq. 21) is only valid in the limiting case $\mathcal{N} \to \infty$
(Cerviño et al.; Fouesneau & Lançon 2010; Piskunov et al.
2011), and that, for finite $\mathcal{N}$ values, they do not provide actual
$\mathcal{N}(m \in [m_a, m_b])$ or $\mathcal{M}$ values but only estimates of such
values. Again, we must understand what exactly this estimate
represents.

To summarize this section, no continuous functional form of
the IMF can provide the actual number of stars, neither for a
given mass nor for a given mass interval, but only an estimate of
it. The only way to give meaning to this estimate is by adopting a
probabilistic framework. This implies using a probabilistic
algebra, which explicitly prevents arbitrary normalizations of $\phi(m)$.

4.1.3. The Weidner & Kroupa case

The studies by Weidner & Kroupa (2004, 2006), Weidner et al.
(2010), and Kroupa et al. (2011) are another example of an
interpretation of the IMF in terms of a distribution of the number
of stars. However, they deserve special attention since they
represent a major effort to include conditions in the IMF.

The equations to find an $\mathcal{M} - m_\mathrm{max}$ relationship proposed by
Weidner & Kroupa (2004, 2006), once corrected for an improper
account of $m_\mathrm{max}$ in $\mathcal{M}$ (Kroupa et al. 2011), are



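[Fig. 9 plot: $m_\mathrm{max}$ versus $\mathcal{M}$ [$M_\odot$]; curves labeled "$\mathcal{M} - m_\mathrm{max}$ (WK)" and "$\mathcal{M} - m_\mathrm{max}$ (Kro 2011)"; linear fit in log-log space with slope 1.83 and intercept 0.42.]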

Fig. 9. $\mathcal{M} - m_\mathrm{max}$ relationship resulting from the distribution
function formulation of the IMF of Elmegreen (1997, 1999,
2000), the formulation of Weidner & Kroupa (2004, 2006),
and the optimal sampling formulation of Kroupa et al. (2011).
The figure includes data points from Weidner et al. (2010) and
Kirk & Myers (2011) and the result of the linear fit of the data to
$\log \mathcal{M}$ as a function of $\log m_\mathrm{max}$.



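$$1 = \int_{m_\mathrm{max}}^{m_\mathrm{up}} \phi_\mathrm{WK}(m)\, dm, \tag{23}$$

$$\mathcal{M} - m_\mathrm{max} = \int_{m_\mathrm{low}}^{m_\mathrm{max}} m\, \phi_\mathrm{WK}(m)\, dm. \tag{24}$$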



As in the previous case, Eq. 23 is equivalent to the
definition of $m_\mathrm{max}$ given in Eq. 7 and $\phi_\mathrm{WK}(m)$ has the same functional
form (scaled by a constant $k_\mathrm{WK}$). A simple inspection shows that
$k_\mathrm{WK} = \mathcal{N}$. The difference with the previous case is in Eq. 24:
the upper limit of the integral is $m_\mathrm{max}$ and not $m_\mathrm{up}$. By doing so,
Kroupa et al. (2011) aim to constrain the IMF in such a way that
Eq. 23 provides the actual $m_\mathrm{max}$ value rather than an estimate of
it.

They justify that Eq. 23 provides such an actual value by
focusing on how the IMF is sampled. Their first approach was the
sorted sampling scenario (Weidner & Kroupa 2006), according
to which the IMF is sort-sampled, where the stars with the
lowest mass are those that form first. This scenario is physically
motivated, based on the hydrodynamical simulations of cluster
formation in competitive accretion without the inclusion of possible
(positive or negative) feedback of massive stars (Bonnell et al.
2003, 2004). Weidner & Kroupa (2006) presented Monte Carlo
simulations to support this model, where clusters with a given
total mass $\mathcal{M}$ are drawn from a randomly sampled IMF. The
number of stars used in the simulation was estimated from $\mathcal{M}$
divided by the mean stellar mass. After that, the sample is sorted
and the desired $\mathcal{M}$ value approximated by accepting or
rejecting the most massive star in the cluster. The most recent work
(Kroupa et al. 2011) is based on the concept of the optimal
sample: sampling is optimal if Eq. 23 is verified and produces the
actual value of $m_\mathrm{max}$. In both cases, it is argued that the IMF is
not randomly sampled. Figure 9 shows the original and the
corrected $\mathcal{M} - m_\mathrm{max}$ relationship they obtain.

This interpretation is based on a strict vision of the IMF as a
stellar counting process involving an individual star, the one with
$m = m_\mathrm{max}$, and a stellar counting plus binning procedure for the
remaining $\mathcal{N} - 1$ stars. This can be seen from the treatment of
the integral limits or, equivalently, the histogram bins,
throughout the different versions. In the original set of equations
proposed by Weidner & Kroupa (2006), $m_\mathrm{max}$ was counted twice in
two non-overlapping bins. The new version (Kroupa et al. 2011)
clearly states the bin where $m_\mathrm{max}$ is, but now it opens a problem
with the $\phi(m)$ definition. We recall that it is mainly a problem
of inclusion of conditions, which is not a trivial issue. Let us
consider the possible self-consistent cases:

1. We use the criteria of equal to or larger than for lower
integral limits and lower than for upper ones to give a physical
meaning to Eq. 23. However, if we want $m_\mathrm{max}$ to appear
directly in the computation of $\mathcal{M}$, we must impose it ad hoc,
which is done by using $\mathcal{M} - m_\mathrm{max}$ instead of $\mathcal{M}$. A
self-consistent formulation, taking into account the integral
limits in Eq. 23, is to write explicitly the mass contribution of
the stars in the $(m_\mathrm{max}, m_\mathrm{up})$ range

$$m_\mathrm{max} = \int_{m_\mathrm{max}}^{m_\mathrm{up}} m\, \phi_\mathrm{WK}(m)\, dm$$

$$\phi_\mathrm{WK}(m) = \delta(m - m_\mathrm{max}) + (\mathcal{N} - 1) \times \phi(m \mid m < m_\mathrm{max}), \tag{25}$$

where $\delta(m - m_\mathrm{max})$ is the Dirac delta function. However,
this implies an ad hoc variation of the $\phi(m)$ functional form,
which is necessary to impose that $m_\mathrm{max}$ is the maximum
stellar mass.

2. We use the criteria of larger than for lower integral limits and
equal to or lower than for upper ones. Then, we can compute
$\mathcal{M}$ properly using $m_\mathrm{max}$ as the upper integral limit. However,
in this case we must change Eq. 23 by

$$\mathcal{N} = \int_{m_\mathrm{low}}^{m_\mathrm{max}} \phi_\mathrm{WK}(m)\, dm$$

$$\phi_\mathrm{WK}(m) = k_\mathrm{WK} \times \phi(m \mid m \leq m_\mathrm{max}), \tag{26}$$

which means that there is no star more massive than $m_\mathrm{max}$.
This means, however, that we lose the equation giving the $m_\mathrm{max}$
value, which must be imposed ad hoc.

Cases (1) and (2) above are the only possible ones, and both
constrain ad hoc $m_\mathrm{max}$ to be the maximum stellar mass in the
cluster. Now, we have shown previously that any description of
the IMF as a continuous function implicitly eliminates the
dependence with $\mathcal{N}$ (and hence $\mathcal{M}$) and its interpretation as a
distribution by number. The Kroupa et al. (2011) case clearly shows
that there is no way to include constraints into a
distribution-by-number description of the IMF and, at the same time, enjoy the
advantages of a continuous distribution representation. Once a
continuous functional form for $\phi(m)$ is assumed, only a pdf
interpretation is valid, and we implicitly renounce obtaining actual
values of stellar masses, actual total masses, or actual values of
$m_\mathrm{max}$. In particular, it would not be possible to obtain a hidden
physical law implicit in the $\phi(m)$ functional form. At most we
could obtain statistical correlations like the $\langle\mathcal{M}\rangle - m_\mathrm{max}$. If there
were such physical laws, their origin would be external to the
IMF and could only be inferred from detailed simulations, and
not from algebraic manipulation of the IMF. That is the price we
must pay for the advantages of a continuous formulation of the
IMF.



4.1.4. The probabilistic case 



The IMF is treated as a probability distribution in Oey & Clarke
(2005); Elmegreen (2006); Parker & Goodwin (2007);
Maschberger & Clarke (2008); Selman & Melnick (2008);
Haas & Anders (2010), among others. Their basic assumption
is similar to the one of this paper, and some partial results
of the description shown here have been obtained by other
authors (including Weidner et al. 2010). Here, we summarize
the results from works on the topic in the global context of the
formulation given in the previous section. The common point
of these works is that, without additional ad hoc conditions,
an $\mathcal{M} - m_\mathrm{max}$ relationship cannot be defined trivially as a
physical law, but only as a statistical correlation. The total
mass in the cluster, the total number of stars in the cluster,
and the particular number of stars with given stellar masses
are not fixed quantities, but distributed ones, and none of them
can be obtained univocally from the others. Hence, the use of
$\mathcal{M} - m_\mathrm{max}$ or the use of $\mathcal{N} - m_\mathrm{max}$ is not just a question of choice
in terms of observational considerations; it is actually the result
of statistical correlations of different distributions.

The probabilistic description of the IMF is included,
by construction, in works that make use of Monte Carlo
simulations (see Weidner & Kroupa 2006; Elmegreen
2006; Parker & Goodwin 2007; Selman & Melnick 2008;
Haas & Anders 2010 as examples), where the IMF is sampled
star by star up to a given value of $\mathcal{M}$ or $\mathcal{N}$. Such Monte Carlo
simulations have been devoted to explaining and comparing different
results using different sampling algorithms. Haas & Anders
(2010) made an explicit, exhaustive, and detailed study of
the issue. As far as we know, only Elmegreen (2006) and
Selman & Melnick (2008) have made theoretical studies aimed
at describing the relationship with $m_\mathrm{max}$ using conditional
probabilities.
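
A minimal sketch of what "sampled star by star up to a given value of $\mathcal{M}$ or $\mathcal{N}$" can look like is given below. It is not the algorithm of any of the cited works. It assumes a truncated power-law pdf with illustrative values ($\alpha = 2.35$, masses between 0.1 and 100), and the stop rule for the mass-limited case (stop once the running total reaches the target) is only one of several possible conventions; all function names are hypothetical.

```python
import random


def draw_mass(rng, alpha=2.35, m_low=0.1, m_up=100.0):
    """Inverse-transform draw from phi(m) ~ m**(-alpha) on [m_low, m_up].

    alpha, m_low and m_up are assumed illustrative values.
    """
    a = 1.0 - alpha
    u = rng.random()
    return (m_low ** a + u * (m_up ** a - m_low ** a)) ** (1.0 / a)


def sample_fixed_number(rng, n_stars):
    """Cluster with exactly n_stars stars; the total mass is left free."""
    return [draw_mass(rng) for _ in range(n_stars)]


def sample_up_to_mass(rng, target_mass):
    """Add stars one by one and stop once the running total reaches target_mass."""
    masses, total = [], 0.0
    while total < target_mass:
        m = draw_mass(rng)
        masses.append(m)
        total += m
    return masses


if __name__ == "__main__":
    rng = random.Random(1)
    by_number = sample_fixed_number(rng, 500)
    by_mass = sample_up_to_mass(rng, 200.0)
    print(len(by_number), sum(by_number), max(by_number))
    print(len(by_mass), sum(by_mass), max(by_mass))
```

In the first function $\mathcal{N}$ is fixed and $\mathcal{M}$ is a distributed quantity; in the second the roles are approximately reversed, which is why the two kinds of simulation probe different correlations.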

Most of the theoretical studies have been carried out in
terms of an $\mathcal{N} - m_\mathrm{max}$ relationship, using $\mathcal{N}$ as variate and
$m_\mathrm{max}$ as variable and making use of $\Phi_{m_\mathrm{max}}(m_\mathrm{max} \mid \mathcal{N})$. They
often include an expression for the mean value of the
distribution (Oey & Clarke 2005), the mode of the distribution
(Gumbel 1958; Kendall & Stuart 1977), or the percentile
analysis (Weidner et al. 2010). However, there is almost no study
in terms of the $m_\mathrm{max} - \mathcal{N}$ relationship nor in the $\Phi_\mathcal{N}(\mathcal{N})$
dependence of the $\mathcal{N} - m_\mathrm{max}$ correlation (Elmegreen 2006;
Selman & Melnick 2008).
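
For iid stellar masses with cumulative distribution $F(m)$, the most massive of $\mathcal{N}$ stars has cumulative distribution $F(m)^{\mathcal{N}}$, so percentiles of $m_\mathrm{max}$ at fixed $\mathcal{N}$ follow by inverting $F$. The sketch below illustrates this standard order-statistics result; it assumes a truncated power-law pdf with illustrative parameters ($\alpha = 2.35$, masses between 0.1 and 100), and the function names are hypothetical.

```python
def imf_cdf(m, alpha=2.35, m_low=0.1, m_up=100.0):
    """F(m) for phi(m) ~ m**(-alpha) on [m_low, m_up] (assumed values)."""
    a = 1.0 - alpha
    return (m ** a - m_low ** a) / (m_up ** a - m_low ** a)


def imf_quantile(u, alpha=2.35, m_low=0.1, m_up=100.0):
    """Inverse of imf_cdf: the mass m with F(m) = u."""
    a = 1.0 - alpha
    return (m_low ** a + u * (m_up ** a - m_low ** a)) ** (1.0 / a)


def mmax_cdf(m, n_stars):
    """P(most massive of n_stars iid stars <= m) = F(m)**n_stars."""
    return imf_cdf(m) ** n_stars


def mmax_percentile(p, n_stars):
    """Mass below which the most massive star falls with probability p."""
    return imf_quantile(p ** (1.0 / n_stars))


if __name__ == "__main__":
    for n_stars in (100, 1000, 10000):
        low, median, high = (mmax_percentile(p, n_stars) for p in (0.1, 0.5, 0.9))
        print(n_stars, low, median, high)
```

The spread between the lower and upper percentiles at fixed $\mathcal{N}$ shows directly that $m_\mathrm{max}$ is a distributed quantity and not a single value.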

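So, in the probabilistic case, the $\mathcal{N} - m_\mathrm{max}$, $\mathcal{M} - m_\mathrm{max}$,
$m_\mathrm{max} - \mathcal{N}$, and $m_\mathrm{max} - \mathcal{M}$ correlations are not equivalent to each
other. The $\mathcal{M} - m_\mathrm{max}$ correlation requires a $\Phi_\mathcal{N}(\mathcal{N} \mid \mathcal{M})$
distribution which is not required by the $\mathcal{N} - m_\mathrm{max}$ correlation. In
addition, establishing the $m_\mathrm{max} - \mathcal{N}$ and $m_\mathrm{max} - \mathcal{M}$ correlations
requires some priors about the distribution of $\Phi_\mathcal{N}(\mathcal{N})$ and $\Phi_\mathcal{M}(\mathcal{M})$
that are not considered in the previous correlations.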

The probabilistic formulation offers the advantages of us- 
ing continuous distributions and including conditions formally. 
However, this does not mean that any condition can be 
represented analytically. We have mentioned above that the 
Weidner & Kroupa (2004, 2006) formulation is a major effort to 
include conditions in the IMF. Let us rewrite Eq. 25 in statistical
terms and give a meaning to such distribution:

$$\phi(m \mid m_\mathrm{max}; \mathcal{N}) = \frac{\delta(m - m_\mathrm{max})}{\mathcal{N}} + \frac{\mathcal{N} - 1}{\mathcal{N}}\, \phi(m \mid m < m_\mathrm{max}). \tag{27}$$



The above equation describes the constrained IMF for a fixed
$m_\mathrm{max}$ value in a set of $\mathcal{N}$ stars. This constraint does not imply that
a star with $m_\mathrm{max}$ is present in the cluster, but just that there are no
stars more massive than $m_\mathrm{max}$ and that the event $m = m_\mathrm{max}$ has a
probability of $1/\mathcal{N}$. Since all the arguments of the characteristic
value hold here, the associated characteristic value is the fixed
$m_\mathrm{max}$ value, which is also a cut-off value of the distribution. So,
63% of realizations for clusters with $\mathcal{N}$ stars following such pdf
have at least one star with mass $m_\mathrm{max}$ (and no stars more massive
than $m_\mathrm{max}$).
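
The 63% figure follows from elementary probability: if each of $\mathcal{N}$ independent stars has probability $1/\mathcal{N}$ of having $m = m_\mathrm{max}$, the chance that at least one of them does is $1 - (1 - 1/\mathcal{N})^{\mathcal{N}}$, which tends to $1 - 1/e \approx 0.632$ for large $\mathcal{N}$. A short check, with a hypothetical function name, is:

```python
import math


def prob_at_least_one(n_stars):
    """P(at least one of n_stars independent draws hits an event of probability 1/n_stars)."""
    return 1.0 - (1.0 - 1.0 / n_stars) ** n_stars


if __name__ == "__main__":
    for n_stars in (10, 100, 1000, 10**6):
        print(n_stars, prob_at_least_one(n_stars))
    # Large-N limit of the expression above.
    print("limit", 1.0 - math.exp(-1.0))
```

The remaining fraction of realizations contains no star with $m = m_\mathrm{max}$ at all, which is the point made in the text.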

Hence, there is no way to include in an analytical form
the condition that the most massive star is actually $m_\mathrm{max}$ and
that such a star is present in any realization. There is also a
similar problem with $\mathcal{M}$, although the problem in this case is
more severe since it also requires a $\Phi(\mathcal{N})$ (discrete) distribution.
However, there is an infinite number of combinations of stellar
masses that are consistent with any reasonable $\mathcal{M} - m_\mathrm{max}$
physical law.

The only possible solution at the moment to include an
$\mathcal{M} - m_\mathrm{max}$ physical law and work with it is to perform a large set
of Monte Carlo simulations, which should assume a particular
$\Phi(\mathcal{N})$ distribution, and just consider the subset where the
chosen $\mathcal{M} - m_\mathrm{max}$ physical law is verified. Then, any physical result
must be obtained numerically (as opposed to analytically). The
advantages of describing $\phi(m)$ as a continuous distribution are
thus lost.
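
The procedure just described (simulate many clusters under an assumed $\Phi(\mathcal{N})$, then keep only those compatible with a chosen law) can be sketched as follows. Everything specific here is an assumption made for illustration: the power-law forms and parameters of the IMF and of $\Phi(\mathcal{N})$, the log-linear shape of the law, its tolerance, and all function names. No particular physical law from the literature is implied.

```python
import math
import random


def draw_mass(rng, alpha=2.35, m_low=0.1, m_up=100.0):
    """Draw one stellar mass from an assumed truncated power-law pdf."""
    a = 1.0 - alpha
    return (m_low ** a + rng.random() * (m_up ** a - m_low ** a)) ** (1.0 / a)


def draw_number(rng, beta=2.0, n_min=10, n_max=1000):
    """Assumed Phi(N) ~ N**(-beta) on [n_min, n_max], truncated to an integer."""
    b = 1.0 - beta
    x = (n_min ** b + rng.random() * (n_max ** b - n_min ** b)) ** (1.0 / b)
    return max(1, int(x))


def simulate(rng, n_clusters):
    """Return (N, total mass, most massive star) for each random cluster."""
    clusters = []
    for _ in range(n_clusters):
        n_stars = draw_number(rng)
        masses = [draw_mass(rng) for _ in range(n_stars)]
        clusters.append((n_stars, sum(masses), max(masses)))
    return clusters


def follows_law(total_mass, m_max, slope, intercept, tol):
    """Hypothetical law: |log10(M) - (slope * log10(m_max) + intercept)| <= tol."""
    predicted = slope * math.log10(m_max) + intercept
    return abs(math.log10(total_mass) - predicted) <= tol


def select(clusters, slope, intercept, tol):
    """Keep only the simulated clusters compatible with the hypothetical law."""
    return [c for c in clusters if follows_law(c[1], c[2], slope, intercept, tol)]


if __name__ == "__main__":
    rng = random.Random(0)
    clusters = simulate(rng, 2000)
    kept = select(clusters, slope=1.0, intercept=1.0, tol=0.1)  # made-up law
    print(len(clusters), len(kept))
```

Any statistic of interest is then computed numerically on the retained subset, whose composition depends on the assumed $\Phi(\mathcal{N})$ as much as on the IMF.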



4.2. Sampling, iid variables, and the relation of the IMF with 
SF 

We have seen that the existence of a physical law linking $\mathcal{M}$ and
$m_\mathrm{max}$ cannot be established through a simple manipulation of the
IMF functional form. The current debate on whether the IMF is
randomly or non-randomly sampled stems mainly from works
by Weidner & Kroupa (2006) and Weidner et al. (2010), where
$m_\mathrm{max}$ is interpreted as the exact value of the most massive star
in a cluster with a given mass. This debate has been focusing
on different sampling proposals. Even if the authors themselves
now consider the sorted sampling proposal just as a first
approximation (Kroupa et al. 2011), we want to emphasize that the key
point of different sampling algorithms is not the sorting process,
but the assumed relation between $\mathcal{N}$ and $\mathcal{M}$ (e.g., the sorted
sampling proposal uses an $\mathcal{N}$ value estimated by means of $\mathcal{M}$
divided by $\langle m \mid m < m_\mathrm{max} \rangle$, which imposes a constraint in $\mathcal{N}$). The
situation is actually more clearly described in the richness effect
proposed by García-Vargas & Díaz (1994); García-Vargas et al.
(1995): a star with mass $m_a$ is formed according to the amount
of gas that remains in the system once a certain number of stars
with $m < m_a$ have been formed. The sampling problem appears
when we try to fix $\mathcal{M}(m < m_a)$ and $\mathcal{N}(m < m_a)$ simultaneously
and include it analytically in the $\phi(m)$ functional form.

As we have shown, there is no self-consistent way to do it
with the current description of $\phi(m)$. The inclusion of any
$\mathcal{M} - m_\mathrm{max}$ physical law, no matter what its interpretation is, precludes
using an analytical functional form for the IMF. The sampling
methods proposed by different authors are actually operational
methods, not an implementation of the physical process.

However, we want to stress that the question of whether the
IMF is randomly sampled or not (i.e., whether stellar masses are iid
variables or not) is completely valid, independent of the particular problem
motivating the question. So we will not attempt to discuss this
question in terms of any specific results from the literature, but from
a more general perspective.

4.2.1. Identical and independent distributed variables and 
the relation of the IMF with the star formation 

The question we aim to answer is: are stellar masses iid vari- 
ables, or, at least, can they be treated as if they were? A sample 
is an iid sample if each random variable has the same identical 
probability distribution and all of them are mutually indepen- 
dent. 

Throughout the paper, we have explicitly excluded any
mention of the SF physics. It is now time to take a look at different
ways in which the SF and the IMF can be linked and how
randomness enters in this game. There are several possible ways.
(a) Some physicists prefer to assume a deterministic universe
in which one and only one result is obtained for a given set of
initial conditions. But there is such a large variety of initial
conditions that they can be only described in a probabilistic way.
Hence the results of SF events, like the IMF itself, can be only
described in a probabilistic way. (b) We can also assume a
universe where determinism, although it exists, is somehow
hidden by complexity. Thus we assume accordingly that the SF is a
complex process in the mathematical sense: nonlinear and with
interconnected components, producing such a large variety of
results that they can only be treated in a probabilistic way. (c) We
admit that there are intrinsically random variables in nature and
that the SF is an intrinsically random process (like turbulence),
so its results can only be treated in a probabilistic way. We
refer to Shadmehri & Elmegreen (2011); Sánchez et al. (2006);
Elmegreen (1999, 2011) as examples where some of these
different scenarios are considered.

The feature common to these three cases is that the IMF
should be used probabilistically (i.e., stellar masses are
randomly sampled), which does not imply that the SF is random.
There would be no physical $\mathcal{M}$ and $m_\mathrm{max}$ relationship at all, or
there would be a deterministic physical law linking $\mathcal{M}$ and $m_\mathrm{max}$.
However, the internal distribution of stellar masses that are
physically compatible (in the SF sense) with this physical law would
depend on a set of unknown (and variable) initial conditions or
intrinsically random characteristics. Then the IMF could only be
described by means of a probabilistic formulation. A
probabilistic interpretation of the IMF does not contradict a deterministic
vision of the physics of SF.

On a large scale, the IMF is the result of all possible SF 
events and SF modes, although it does not necessarily describe 
any particular one. Following this argument, we are able to de- 
scribe probabilistically the incidence of having a star with a 



We note that any sampling proposal that aims to reproduce an
$\mathcal{M} - m_\mathrm{max}$ physical law with a finite number of stars $\mathcal{N}$ is also doomed
to this situation: it provides a $\phi(m_i)$ array, but not a continuous $\phi(m)$
distribution.



The optimal sampling algorithm provided by Kroupa et al. (2011)
is based on obtaining bins through the larger than for lower integral
limits and equal to or lower than for upper integral limits. These criteria
are complementary to those underlying their equations to obtain the
$\mathcal{M} - m_\mathrm{max}$ relationship. In addition, the IMF is filled from $m_\mathrm{max}$ down
to lower masses, contrary to the physical arguments given to justify the
sorted sampling algorithm. We stress that it is not a problem of the
formulation inasmuch as the physical formulation of the problem is
not linked with the operational mathematical method used to solve the
physical equations.






Cerviño et al.: The IMF and the $m_\mathrm{max} - \mathcal{M}$ statistical correlation



given mass that was born at a given time, the stellar birth rate
$\mathcal{B}(m, t)$, as the composition of two independent functions: the
star formation history, SFH $\psi(t, \mathcal{M})$ (although $\psi(t, \mathcal{N})$ would be
more adequate) and the IMF, $\phi(m)$ (Schmidt 1959, 1963; Tinsley
1980; Scalo 1986). The first function includes all the possible SF
modes and provides the time-scale and the amount of gas
transformed into stars. The second one describes how a given amount
of gas would be distributed among different stellar masses. We
recall that the first IMF determinations were done with field stars
(Salpeter 1955), so they implicitly averaged a large variety of SF
modes.

The separation of $\mathcal{B}(m, t)$ into two independent functions
seems to be a valid approach for the study of galaxies and a
variety of systems where different modes of star formation
coexist; it has been extensively used in extragalactic astronomy and
cosmology. One particular characteristic of this approach is the
use of single stellar populations (SSP; Renzini & Buzzoni 1986),
which corresponds to $\psi(t, \mathcal{N}) = \mathcal{N} \times \delta(t)$. Since any function
can be described by a sum of $\delta(t - t_i)$ functions, it allows the
SFH to be recovered from observational data or the evolution of
galaxies to be described as a composition of SSPs with different
intensity. The star formation rate, SFR, can then be defined as a
time average of the SFH (da Silva et al. 2012) or as the result of
a flat SFH ($\psi(t, \mathcal{N})$ = constant). Current SF rate indicators are
based on SSP modeling with constant SFH (Kennicutt 1998).

The case would be different if we changed the scale to
smaller systems. When we restrict the situation to specific SF
modes, particular details emerge and have some imprint on the
IMF. The more restrictive the mode, the more details are present.
In this case we are moving ourselves to particular IMF
realizations with given conditions, which may depart from the
probabilistic description given by $\phi(m)$. At small scales, the validity
of the decomposition of $\mathcal{B}(m, t)$ in two independent functions
is not clear. However, the universality of the IMF even at such
scales leads one to think that it would be the case (however,
see Elmegreen 2011 for an example of possible variations of the
IMF, especially in the low-mass tail, depending on the
environmental conditions).

The approach we have presented here when talking about
$\mathcal{B}(m, t)$ is a top-down one: $\phi(m)$ is the most generic
representation, so that the larger the system, the more valid it is. We note
that this vision is mentioned by Vanbeveren (1982), who also
claimed existence of an $\mathcal{M} - m_\mathrm{max}$ physical law. Because there
is a universal IMF at a large scale, he says, the IMF varies at
small scale.

In this case it is expected that the IMF has a quasi-universal shape
at large scales with possible variations at small scales. Here, we
understand that deviations from a universal shape are allowed
as far as they are small compared to the global budget. In
addition, the incidence of deviations also depends on the size of
the system, that is, the integral of the $\psi(t, \mathcal{N})$ over time (see
da Silva et al. 2012 for a discussion).

There is also a bottom-up approach when talking about
$\mathcal{B}(m, t)$, which is the one proposed by the IGIMF theory. In
this case, universality in the IMF functional form is assumed.
However, there is an $\mathcal{M} - m_\mathrm{max}$ physical law that relates $\mathcal{M}$ with
$m_\mathrm{max}$, hence there is IMF variability in the sense of a variable
$m_\mathrm{max}$ for given $\mathcal{M}$. It is assumed that this physical law operates
for all SF modes, or equivalently, that there is one SF mode: star
formation in clusters. In this case, the mass distribution of stars
depends on where (and when) they were formed, so only stars
formed in the same cluster (or clusters with the same $\mathcal{M}$) share
the same IMF.



For the study of galaxies or, in general, systems that may 
contain clusters with different masses, it is necessary to take into 
account the distribution of the total masses of these clusters: the 
ICMF. As a result, at a galactic scale there is not one IMF, but
an IGIMF that results from the combination of the ICMF and
different IMFs. It depends on Δt and implies a redefinition of
the IMF itself (Kroupa & Weidner 2003). In this case it is not
clear whether B(m, t) can be separated into independent functions,
or how (Cerviño et al. 2011). This implies major revisions of global
galactic and extragalactic studies, including the SSP concept,
and there is currently a large debate on the issue (Corbelli et al.
2009; Fumagalli et al. 2011; Eldridge 2012). Although a full
discussion goes beyond the scope of this paper, we want to point out
that there could be an ⟨M⟩ - m_max physical law, although it must
be imposed ad hoc, and that, whatever the case, random sampling
and a probabilistic description of the IMF are compatible
with it.

5. Conclusions 

Having carried out a thorough analysis of different IMF interpretations,
with a focus on the question of how information on
m_max can be extracted from the IMF itself, we are in a position
to formulate the problem in a different way: What information
does the IMF contain? Can we extract information on the SF
process from an algebraic manipulation of the IMF? The answers
to these questions are driven by the interpretation of the
IMF adopted by each author and, in particular, their conclusion
as to whether, without direct observations, m_max can be exactly
determined or just estimated.

Our analysis of the problem has led us to the following main
conclusion: Only a probabilistic interpretation of the IMF, where
φ(m) is a pdf (ruling out arbitrary normalizations) and stellar
masses are randomly sampled iid variables, provides a physically
and mathematically self-consistent formulation that explains
the ⟨M⟩ - m_max statistical correlation obtained from IMF algebraic
manipulation. We also give plausible arguments that introduce
the IMF as a probability distribution when related to
the physics of the star formation process.
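The statistical nature of this correlation can be illustrated with a small Monte Carlo experiment. The following is a minimal sketch, not the code used in this work: it assumes a single truncated power-law IMF with a Salpeter-like slope (ALPHA = 2.35) between M_LOW = 0.1 and M_UP = 120 solar masses, and these values, as well as all function names, are illustrative assumptions. Stellar masses are drawn as iid variables, and for each cluster size N the sketch records the realized total mass and the realized maximum mass.

```python
import random
import statistics

# Assumed IMF for illustration only: a single truncated power law
# (Salpeter-like slope). The IMF and mass limits of the paper may differ.
ALPHA, M_LOW, M_UP = 2.35, 0.1, 120.0
A1 = 1.0 - ALPHA  # exponent of the integrated power law


def draw_mass(rng):
    """Draw one stellar mass by inverting the cdf of the truncated power law."""
    u = rng.random()
    return (M_LOW**A1 + u * (M_UP**A1 - M_LOW**A1)) ** (1.0 / A1)


def mean_mass():
    """Analytic mean stellar mass <m>, the integral of m * phi(m) dm."""
    a2 = 2.0 - ALPHA
    return (A1 / a2) * (M_UP**a2 - M_LOW**a2) / (M_UP**A1 - M_LOW**A1)


def simulate(n_stars, n_clusters, rng):
    """Return total masses and maximum masses of n_clusters realizations."""
    totals, maxima = [], []
    for _ in range(n_clusters):
        masses = [draw_mass(rng) for _ in range(n_stars)]
        totals.append(sum(masses))
        maxima.append(max(masses))
    return totals, maxima


rng = random.Random(42)  # fixed seed so the sketch is reproducible
results = {}
for n_stars in (10, 100, 1000):
    totals, maxima = simulate(n_stars, 500, rng)
    results[n_stars] = {
        "mean_M": statistics.mean(totals),       # average realized mass
        "expected_M": n_stars * mean_mass(),     # <M> = N x <m>
        "std_M": statistics.stdev(totals),       # scatter of realized M
        "median_mmax": statistics.median(maxima),
    }
    print(n_stars, results[n_stars])
```

Under these assumptions the median of the realized maximum mass grows with N, while each individual cluster scatters around it, and the realized total mass scatters around N × ⟨m⟩. No physical law linking the two quantities has been imposed in the sketch.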

Additional conclusions of this work are: 

1. The actual total stellar mass of a cluster, M, cannot be inferred
from an IMF, φ(m), with a continuous functional form.
A direct IMF integration only provides its mean value, ⟨M⟩,
for a given number of stars N:

$$\langle M \rangle = N \times \langle m \rangle = N \times \int m\,\phi(m)\,\mathrm{d}m. \qquad (28)$$

Although some authors do not consider N as a relevant physical
variable (Kroupa et al. 2011), the discreteness of stars and the
fact that N is a natural number are relevant physical
constraints that must be included in the treatment of the IMF
and in the algebra used to obtain physical results from it.

2. Given the equation defining the most massive star in a system,

$$\frac{1}{N} = \int_{m_\mathrm{max}}^{m_\mathrm{up}} \phi(m)\,\mathrm{d}m, \qquad (29)$$

the resulting ⟨M⟩ - m_max relation is practically independent of
the specific IMF interpretation adopted. However, how this
equation is understood strongly depends on the framework
of the interpretation.



3. In a probabilistic interpretation, Eq. 29 provides a characteristic
mass, m_max, that is, the value of m that is not reached or
exceeded with a probability 0.37 in a sample of N stars, but
not the actual mass of the most massive star in the sample.

4. For any m_max ≳ 10 M⊙ and not close to m_up, there is a probability
larger than 90% that the most massive star in the system
is more massive than that m_max value. Therefore, assuming that
Eq. 29 provides the actual mass of the most massive star in
the cluster, as argued in the framework of different interpretations
of the IMF, is an ad hoc assumption and not a physical
fact.

5. m_max defines the mode of the distribution Φ_N(N|m_max) of
the possible N values inferred from the most massive star
in the cluster, assuming a flat Φ_N(N) distribution. A similar
dependence on Φ_N(N) is present when M is inferred from the
number of the N_a most massive stars in the cluster (cf. Paper
II). However, the observational evidence is that Φ_N(N) is a
power law (if it is related with the ICMF).

6. When the total cluster mass is inferred through the equation
⟨M⟩ = N × ⟨m⟩ and N is obtained assuming a flat Φ_N(N),
the observational data become consistent with an m_max - ⟨M⟩
statistical correlation. This is indeed the case when Φ_N(N) is
not taken into account explicitly in the N (and M) estimation
(as found for most of the clusters in the Weidner et al. 2010
sample).

7. The meaningful distribution to be tested against observational
data is Φ_{m_max,N}(m_max, N), and not Φ_N(N|m_max) or
Φ_{m_max}(m_max|N).



8. Weidner et al. (2010) claim that the results of their analysis
falsify the hypothesis of a random sampling of the IMF.
Based on the two preceding points, we consider that this
claim should be revised, both because of the M values it relies
on and because of the methodological choice of using
Φ_{m_max}(m_max|N).

9. The different sampling algorithms proposed in the literature are
not physical requirements, but convenient mathematical algorithms
that try to simplify the implications of such a physical
law on studies where the IMF is used (as is the case of
stellar populations in galaxies). Unfortunately, such a
simplification is not possible.

10. We cannot exclude that a hard physical law linking M to
m_max (the actual values) does indeed exist; but, if this is the
case, it must arise from considerations of the problem including
a full-fledged SF analysis, which cannot be shortcut
through algebraic IMF manipulations. Whatever the case is,
the existence of such an M - m_max physical law is compatible
with random sampling of stellar masses and a probabilistic
interpretation of the IMF.

11. If such a physical law exists, it cannot be incorporated into
an analytical IMF functional form, but must rather be approached
by computing Monte Carlo simulations and taking
into account only the subset of simulations that verify the assumed
M - m_max physical law. We note that this approach
is fully compatible with the optimal sampling definition provided
by Kroupa et al. (2011).
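Points 3 to 5 of the list above can be checked numerically. The sketch below is only an illustration under stated assumptions: the IMF is a single truncated power law (ALPHA = 2.35 between 0.1 and 120 solar masses), the power-law prior on N uses an exponent beta = 2 chosen purely as an example, and all function names are hypothetical. It solves Eq. 29 for the characteristic mass, evaluates the probability that no star in a sample of N reaches it, and locates the mode of Φ_N(N|m_max) for a flat and for a power-law Φ_N(N).

```python
import math

# Assumed IMF for illustration only (truncated Salpeter-like power law).
ALPHA, M_LOW, M_UP = 2.35, 0.1, 120.0
A1 = 1.0 - ALPHA
NORM = M_UP**A1 - M_LOW**A1


def cdf(m):
    """Cumulative distribution F(m) = p(mass < m) for the assumed IMF."""
    return (m**A1 - M_LOW**A1) / NORM


def characteristic_mmax(n_stars):
    """Solve Eq. 29, 1/N = integral from m_max to m_up of phi(m) dm."""
    return (M_LOW**A1 + (1.0 - 1.0 / n_stars) * NORM) ** (1.0 / A1)


def prob_all_below(m, n_stars):
    """Probability that none of n_stars iid masses reaches or exceeds m."""
    return cdf(m) ** n_stars


def posterior_mode_n(m_obs, beta, n_max=10000):
    """Mode of Phi_N(N | m_max = m_obs) for a prior Phi_N(N) ~ N**(-beta).

    The density of the maximum of N iid draws is N * phi(m) * F(m)**(N - 1),
    so, up to terms independent of N, the log posterior is
    (1 - beta) * log(N) + (N - 1) * log(F(m_obs)).
    """
    log_f = math.log(cdf(m_obs))
    return max(range(1, n_max + 1),
               key=lambda n: (1.0 - beta) * math.log(n) + (n - 1) * log_f)


n_stars = 100
m_char = characteristic_mmax(n_stars)
p_below = prob_all_below(m_char, n_stars)              # close to exp(-1)
mode_flat = posterior_mode_n(m_char, beta=0.0)         # flat Phi_N(N)
mode_power_law = posterior_mode_n(m_char, beta=2.0)    # assumed power law
print(m_char, p_below, mode_flat, mode_power_law)
```

With a flat Φ_N(N) the mode falls at (or within one unit of) the N used in Eq. 29, whereas with the assumed steep power law the mode moves to the smallest allowed N. The inferred N therefore depends on the adopted Φ_N(N), as argued in point 5.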

We conclude that a random sampling IMF is not in contradiction
to a possible m_max - M physical law. However, such a law
cannot be obtained from IMF algebraic manipulation or included
analytically in the IMF functional form. The possible physical
information that would be obtained from the N (or M) - m_max
correlation is closely linked with the Φ_M(M) and Φ_N(N) distributions;
hence it depends on the SF process and the assumed
definition of stellar cluster. In a second paper of this series we
will explore the application of the probabilistic description of
the IMF formulated in this study. In particular, we will describe
how to use it to make inferences about quantities that characterize
some stellar systems, and how observational constraints
work as a priori conditions, affecting the sampling distributions
of M and N that we can infer.

Fig. A.1. Intensity function μ(m) as a function of m for the IMF.
The figure also shows the probability that m will be in the range
[m_b, m_b + 1 M⊙]. (Curves: μ(m) for φ(m); P(m in [m, m+1]) for
φ(m). Mass axis in M⊙, with ticks from 0.01 to 100.)



Appendix A: The intensity function 

As stated in Sect. 3, φ(m) cannot provide a value of m_max that
can be used as the actual maximum stellar mass in a hypothetical
cluster. Still, we can calculate the probability for the actual value
of m_max to be close to the mean, the median, the characteristic
value, or the mode of Φ_{m_max}(m_max|N). In general, we can evaluate
the probability that a value known to be larger than m_b is smaller
than m_b + dm_b. To do that, we need to introduce the intensity
function μ(m_b):

$$\mu(m_b)\,\mathrm{d}m_b = \frac{1}{1 - p(m < m_b)} \times \phi(m_b)\,\mathrm{d}m_b. \qquad (\mathrm{A.1})$$



The intensity function is not a pdf; it is independent of N, as
implicit in the iid variable hypothesis: the probability of obtaining
a value equal to or larger than 5 throwing one die is 2/6,
independently of previous throws. This must not be confused with
the case we studied in the previous paragraphs, which would be
equivalent to the probability of obtaining at least one throw with
a result equal to or larger than 5 in N draws.
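The difference between the two probabilities in the die example can be made explicit with exact fractions. This is a minimal sketch; the function names are illustrative and not taken from this work.

```python
from fractions import Fraction


def p_single(threshold=5, faces=6):
    """Probability that one throw of a fair die gives threshold or more."""
    return Fraction(faces - threshold + 1, faces)


def p_at_least_one(n_throws, threshold=5, faces=6):
    """Probability that at least one of n_throws gives threshold or more."""
    return 1 - (1 - p_single(threshold, faces)) ** n_throws


for n_throws in (1, 2, 10):
    # The single-throw probability never changes with n_throws;
    # only the "at least one" probability does.
    print(n_throws, p_single(), p_at_least_one(n_throws))
```

The single-throw probability stays at 2/6 whatever the number of throws, which is the analog of the intensity function being independent of N, while the probability of at least one success grows with the number of throws.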

In Fig. A.1 we plot the intensity function for different values
of m_b for the case of the IMF used in this work. The figure also
shows the probability that a star known to have m > m_b will be
in the range [m_b, m_b + 1 M⊙]. The figure shows that μ(m_b) has a
minimum at a value close to m_up, and it goes to infinity at m_up.
The probability of m in the range [m_b, m_b + 1 M⊙] decreases with
m_b, except for values close to m_up. For example, there is only a
chance lower than 10% that, given a star in the m_b - m_up range,
this star has a mass in the range [m_b, m_b + 1 M⊙] for
m_b > 10 M⊙. The situation changes
in the extreme case in which m_b is close to m_up: if we know that
there is one star with mass m_up or larger, the mass must certainly
be m_up (i.e., probability equal to 1), since stars with mass larger
than m_up do not exist.

Note on notation: we use μ(m) to follow the notation used by
Gumbel (1958). It must not be confused with the definition of the
mean value that is used in other papers.

This has an interesting implication for the statement that
m_max actually provides the mass of the most massive star in the
cluster: assuming that there is one star equal to or more massive
than m_max and that m_max ≳ 10 M⊙ and is not close to m_up, there
is a probability larger than 90% that the most massive star is
more massive than m_max.
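The behavior of the intensity function described in this appendix can be reproduced with a few lines of code. The sketch below evaluates Eq. A.1 and the probability that a star known to have m > m_b lies in [m_b, m_b + 1 M⊙]. It assumes a single truncated power-law IMF (ALPHA = 2.35 between 0.1 and 120 solar masses), which is an illustrative choice and not necessarily the IMF used for Fig. A.1, so the exact mass at which the probability drops below 10% may differ from the one quoted in the text.

```python
import math

# Assumed IMF for illustration only (truncated Salpeter-like power law).
ALPHA, M_LOW, M_UP = 2.35, 0.1, 120.0
A1 = 1.0 - ALPHA
NORM = M_UP**A1 - M_LOW**A1


def pdf(m):
    """phi(m), normalized to unit integral between M_LOW and M_UP."""
    return A1 * m ** (-ALPHA) / NORM


def survival(m):
    """1 - p(mass < m): probability that a star is at least as massive as m."""
    return (M_UP**A1 - m**A1) / NORM


def intensity(m_b):
    """Intensity function of Eq. A.1: phi(m_b) / (1 - p(m < m_b))."""
    s = survival(m_b)
    return math.inf if s <= 0.0 else pdf(m_b) / s


def prob_within(m_b, width=1.0):
    """P(m in [m_b, m_b + width]) for a star known to have m > m_b."""
    upper = min(m_b + width, M_UP)  # no stars exist above M_UP
    return (survival(m_b) - survival(upper)) / survival(m_b)


for m_b in (1.0, 20.0, 60.0, 100.0, 119.5):
    print(m_b, intensity(m_b), prob_within(m_b))
```

For this assumed IMF the intensity first decreases with m_b, reaches a minimum below m_up, and then diverges as m_b approaches m_up, while the probability of finding the star within 1 M⊙ of m_b becomes 1 once m_b + 1 M⊙ exceeds m_up.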